Overview

Dataset statistics

Number of variables: 25
Number of observations: 34493
Missing cells: 310496
Missing cells (%): 36.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 26.1 MiB
Average record size in memory: 793.2 B
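
As a quick consistency check, the missing-cell percentage can be recomputed from the counts above. The sketch below assumes the percentage is missing cells divided by the total number of cells (variables × observations), which agrees with the reported 36.0%. The variable names are illustrative only.

```python
# Counts copied from the dataset statistics above.
n_variables = 25
n_observations = 34493
missing_cells = 310496

# Assumption: missing % = missing cells / (variables * observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```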

Variable types

Categorical: 10
Numeric: 12
Unsupported: 3

Alerts

ClaseVehiculo__c has constant value "99999" Constant
TipoVehiculo__c has constant value "99999" Constant
n_prod_prev is highly correlated with total_siniestros and 2 other fields High correlation
total_siniestros is highly correlated with n_prod_prev and 2 other fields High correlation
total_pagado_smmlv is highly correlated with n_prod_prev and 3 other fields High correlation
anios_ultimo_siniestro is highly correlated with n_prod_prev and 2 other fields High correlation
Activos__c is highly correlated with AnnualRevenue and 1 other field High correlation
AnnualRevenue is highly correlated with Activos__c and 1 other field High correlation
MontoAnual__c is highly correlated with total_pagado_smmlv High correlation
EgresosAnuales__c is highly correlated with Activos__c and 1 other field High correlation
total_siniestros is highly correlated with total_pagado_smmlv and 1 other field High correlation
total_pagado_smmlv is highly correlated with total_siniestros and 1 other field High correlation
anios_ultimo_siniestro is highly correlated with total_siniestros and 1 other field High correlation
AnnualRevenue is highly correlated with EgresosAnuales__c High correlation
EgresosAnuales__c is highly correlated with AnnualRevenue High correlation
total_siniestros is highly correlated with total_pagado_smmlv and 1 other field High correlation
total_pagado_smmlv is highly correlated with total_siniestros and 2 other fields High correlation
anios_ultimo_siniestro is highly correlated with total_siniestros and 1 other field High correlation
AnnualRevenue is highly correlated with EgresosAnuales__c High correlation
MontoAnual__c is highly correlated with total_pagado_smmlv High correlation
EgresosAnuales__c is highly correlated with AnnualRevenue High correlation
FechaInicioVigencia__ctrim is highly correlated with TipoVehiculo__c and 2 other fields High correlation
CodigoTipoAsegurado__c is highly correlated with TipoVehiculo__c and 4 other fields High correlation
TipoVehiculo__c is highly correlated with FechaInicioVigencia__ctrim and 8 other fields High correlation
ciudad_name is highly correlated with CodigoTipoAsegurado__c and 2 other fields High correlation
tipo_ramo_name is highly correlated with TipoVehiculo__c and 2 other fields High correlation
tipo_prod_desc is highly correlated with TipoVehiculo__c and 2 other fields High correlation
Genero__pc is highly correlated with CodigoTipoAsegurado__c and 3 other fields High correlation
churn is highly correlated with FechaInicioVigencia__ctrim and 2 other fields High correlation
ClaseVehiculo__c is highly correlated with FechaInicioVigencia__ctrim and 8 other fields High correlation
EstadoCivil__pc is highly correlated with CodigoTipoAsegurado__c and 3 other fields High correlation
PuntoVenta__c is highly correlated with n_prod_prev and 2 other fields High correlation
tipo_ramo_name is highly correlated with tipo_prod_desc and 1 other field High correlation
tipo_prod_desc is highly correlated with tipo_ramo_name and 1 other field High correlation
NumeroPoliza__c is highly correlated with tipo_ramo_name and 1 other field High correlation
FechaInicioVigencia__ctrim is highly correlated with churn High correlation
churn is highly correlated with FechaInicioVigencia__ctrim High correlation
n_prod_prev is highly correlated with PuntoVenta__c and 3 other fields High correlation
total_siniestros is highly correlated with PuntoVenta__c and 2 other fields High correlation
total_pagado_smmlv is highly correlated with PuntoVenta__c and 3 other fields High correlation
anios_ultimo_siniestro is highly correlated with n_prod_prev and 2 other fields High correlation
AnnualRevenue is highly correlated with EgresosAnuales__c High correlation
MontoAnual__c is highly correlated with edad High correlation
OtrosIngresos__c is highly correlated with anios_ultimo_siniestro High correlation
EgresosAnuales__c is highly correlated with AnnualRevenue High correlation
EstadoCivil__pc is highly correlated with Genero__pc High correlation
Genero__pc is highly correlated with EstadoCivil__pc High correlation
edad is highly correlated with MontoAnual__c High correlation
MarcaVehiculo__c has 34493 (100.0%) missing values Missing
MdeloVehiculo__c has 34493 (100.0%) missing values Missing
n_prod_prev has 1130 (3.3%) missing values Missing
total_siniestros has 18691 (54.2%) missing values Missing
total_pagado_smmlv has 18691 (54.2%) missing values Missing
anios_ultimo_siniestro has 18691 (54.2%) missing values Missing
Activos__c has 14407 (41.8%) missing values Missing
AnnualRevenue has 14407 (41.8%) missing values Missing
MontoAnual__c has 34473 (99.9%) missing values Missing
OtrosIngresos__c has 15501 (44.9%) missing values Missing
Profesion__pc has 34493 (100.0%) missing values Missing
EgresosAnuales__c has 14407 (41.8%) missing values Missing
EstadoCivil__pc has 14147 (41.0%) missing values Missing
Genero__pc has 14147 (41.0%) missing values Missing
ciudad_name has 14147 (41.0%) missing values Missing
edad has 14178 (41.1%) missing values Missing
Activos__c is highly skewed (γ1 = 96.7181334) Skewed
AnnualRevenue is highly skewed (γ1 = 22.54403697) Skewed
OtrosIngresos__c is highly skewed (γ1 = 98.99998221) Skewed
MarcaVehiculo__c is an unsupported type, check if it needs cleaning or further analysis Unsupported
MdeloVehiculo__c is an unsupported type, check if it needs cleaning or further analysis Unsupported
Profesion__pc is an unsupported type, check if it needs cleaning or further analysis Unsupported
total_pagado_smmlv has 1697 (4.9%) zeros Zeros
OtrosIngresos__c has 18238 (52.9%) zeros Zeros
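
The Constant and 100% Missing alerts above point at columns that carry no usable information (ClaseVehiculo__c, TipoVehiculo__c, MarcaVehiculo__c, MdeloVehiculo__c, Profesion__pc). A minimal sketch of how such columns could be flagged before modelling is shown below. It assumes the data is available as a mapping from column name to a list of values, with None marking a missing value. The helper name `uninformative_columns` and the toy data are hypothetical and are not part of the report.

```python
from typing import Any, Dict, List


def uninformative_columns(columns: Dict[str, List[Any]]) -> List[str]:
    """Return names of columns that are entirely missing or constant.

    Hypothetical helper: missing values are assumed to be None, and a
    column counts as constant if its non-missing values are all equal.
    """
    flagged = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        # Zero distinct values means all missing; one means constant.
        if len(set(present)) <= 1:
            flagged.append(name)
    return flagged


# Toy data that mimics the alerts above (not the real dataset).
toy = {
    "ClaseVehiculo__c": ["99999", "99999", "99999"],
    "MarcaVehiculo__c": [None, None, None],
    "churn": ["1", "0", "1"],
}
print(uninformative_columns(toy))
```

Whether to drop the flagged columns or fix them upstream is a separate decision. The constant "99999" may be a placeholder code rather than a real value.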

Reproduction

Analysis started: 2022-06-07 22:28:51.733354
Analysis finished: 2022-06-07 22:34:15.833355
Duration: 5 minutes and 24.1 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

CodigoTipoAsegurado__c
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

1 | 33658
4 | 624
3 | 145
2 | 66

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters34493
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2
2nd row4
3rd row2
4th row4
5th row1

Common Values

Value | Count | Frequency (%)
1 | 33658 | 97.6%
4 | 624 | 1.8%
3 | 145 | 0.4%
2 | 66 | 0.2%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
133658
97.6%
4624
 
1.8%
3145
 
0.4%
266
 
0.2%

Most occurring characters

ValueCountFrequency (%)
133658
97.6%
4624
 
1.8%
3145
 
0.4%
266
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number34493
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
133658
97.6%
4624
 
1.8%
3145
 
0.4%
266
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Common34493
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
133658
97.6%
4624
 
1.8%
3145
 
0.4%
266
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII34493
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
133658
97.6%
4624
 
1.8%
3145
 
0.4%
266
 
0.2%

PuntoVenta__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct268
Distinct (%)0.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean4164.068855
Minimum5
Maximum9977
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size269.6 KiB

Quantile statistics

Minimum5
5-th percentile120
Q11048
median2707
Q39474
95-th percentile9721
Maximum9977
Range9972
Interquartile range (IQR)8426

Descriptive statistics

Standard deviation3682.154548
Coefficient of variation (CV)0.8842684108
Kurtosis-1.260203819
Mean4164.068855
Median Absolute Deviation (MAD)1705
Skewness0.6623917598
Sum143631227
Variance13558262.12
MonotonicityNot monotonic
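
Several of the statistics above are derived from the others, so they can be cross-checked by hand. The sketch below recomputes them for PuntoVenta__c using the usual textbook definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). That the report uses exactly these definitions is an assumption, although the numbers agree to the printed precision.

```python
# Values copied from the PuntoVenta__c statistics above.
minimum, maximum = 5, 9977
q1, q3 = 1048, 9474
mean = 4164.068855
std = 3682.154548

value_range = maximum - minimum  # reported Range: 9972
iqr = q3 - q1                    # reported IQR: 8426
cv = std / mean                  # reported CV: 0.8842684108
variance = std ** 2              # reported Variance: 13558262.12

print(value_range, iqr, round(cv, 6), round(variance, 1))
```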
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
9721 | 7160 | 20.8%
3301 | 1871 | 5.4%
1048 | 1735 | 5.0%
3202 | 1643 | 4.8%
2466 | 1433 | 4.2%
3502 | 1198 | 3.5%
404 | 1170 | 3.4%
301 | 1160 | 3.4%
9474 | 692 | 2.0%
501 | 553 | 1.6%
Other values (258) | 15878 | 46.0%

Minimum values
Value | Count | Frequency (%)
5 | 133 | 0.4%
8 | 32 | 0.1%
9 | 54 | 0.2%
14 | 140 | 0.4%
16 | 121 | 0.4%
23 | 504 | 1.5%
25 | 126 | 0.4%
26 | 92 | 0.3%
100 | 1 | < 0.1%
102 | 9 | < 0.1%

Maximum values
Value | Count | Frequency (%)
9977 | 8 | < 0.1%
9973 | 1 | < 0.1%
9971 | 75 | 0.2%
9969 | 51 | 0.1%
9967 | 9 | < 0.1%
9943 | 5 | < 0.1%
9849 | 53 | 0.2%
9838 | 6 | < 0.1%
9830 | 1 | < 0.1%
9821 | 8 | < 0.1%

tipo_ramo_name
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

automoviles | 34397
responsabilidad civil | 96

Length

Max length21
Median length11
Mean length11.02783173
Min length11

Characters and Unicode

Total characters380383
Distinct characters17
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowautomoviles
2nd rowautomoviles
3rd rowautomoviles
4th rowautomoviles
5th rowautomoviles

Common Values

Value | Count | Frequency (%)
automoviles | 34397 | 99.7%
responsabilidad civil | 96 | 0.3%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
automoviles34397
99.4%
responsabilidad96
 
0.3%
civil96
 
0.3%

Most occurring characters

ValueCountFrequency (%)
o68890
18.1%
i34781
9.1%
a34589
9.1%
s34589
9.1%
l34589
9.1%
e34493
9.1%
v34493
9.1%
m34397
9.0%
u34397
9.0%
t34397
9.0%
Other values (7)768
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter380287
> 99.9%
Space Separator96
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o68890
18.1%
i34781
9.1%
a34589
9.1%
s34589
9.1%
l34589
9.1%
e34493
9.1%
v34493
9.1%
m34397
9.0%
u34397
9.0%
t34397
9.0%
Other values (6)672
 
0.2%
Space Separator
ValueCountFrequency (%)
96
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin380287
> 99.9%
Common96
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o68890
18.1%
i34781
9.1%
a34589
9.1%
s34589
9.1%
l34589
9.1%
e34493
9.1%
v34493
9.1%
m34397
9.0%
u34397
9.0%
t34397
9.0%
Other values (6)672
 
0.2%
Common
ValueCountFrequency (%)
96
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII380383
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o68890
18.1%
i34781
9.1%
a34589
9.1%
s34589
9.1%
l34589
9.1%
e34493
9.1%
v34493
9.1%
m34397
9.0%
u34397
9.0%
t34397
9.0%
Other values (7)768
 
0.2%

tipo_prod_desc
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

automoviles | 34397
profesionales medicos | 56
directores y administradores | 38
servidores publicos | 2

Length

Max length28
Median length11
Mean length11.03542748
Min length11

Characters and Unicode

Total characters380645
Distinct characters19
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowautomoviles
2nd rowautomoviles
3rd rowautomoviles
4th rowautomoviles
5th rowautomoviles

Common Values

Value | Count | Frequency (%)
automoviles | 34397 | 99.7%
profesionales medicos | 56 | 0.2%
directores y administradores | 38 | 0.1%
servidores publicos | 2 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
automoviles34397
99.3%
profesionales56
 
0.2%
medicos56
 
0.2%
directores38
 
0.1%
y38
 
0.1%
administradores38
 
0.1%
servidores2
 
< 0.1%
publicos2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o69042
18.1%
s34685
9.1%
e34683
9.1%
i34627
9.1%
a34529
9.1%
m34491
9.1%
t34473
9.1%
l34455
9.1%
v34399
9.0%
u34399
9.0%
Other values (9)862
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter380511
> 99.9%
Space Separator134
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o69042
18.1%
s34685
9.1%
e34683
9.1%
i34627
9.1%
a34529
9.1%
m34491
9.1%
t34473
9.1%
l34455
9.1%
v34399
9.0%
u34399
9.0%
Other values (8)728
 
0.2%
Space Separator
ValueCountFrequency (%)
134
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin380511
> 99.9%
Common134
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o69042
18.1%
s34685
9.1%
e34683
9.1%
i34627
9.1%
a34529
9.1%
m34491
9.1%
t34473
9.1%
l34455
9.1%
v34399
9.0%
u34399
9.0%
Other values (8)728
 
0.2%
Common
ValueCountFrequency (%)
134
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII380645
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o69042
18.1%
s34685
9.1%
e34683
9.1%
i34627
9.1%
a34529
9.1%
m34491
9.1%
t34473
9.1%
l34455
9.1%
v34399
9.0%
u34399
9.0%
Other values (9)862
 
0.2%

ClaseVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

99999 | 34493

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters172465
Distinct characters1
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row99999
2nd row99999
3rd row99999
4th row99999
5th row99999

Common Values

Value | Count | Frequency (%)
99999 | 34493 | 100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
9999934493
100.0%

Most occurring characters

ValueCountFrequency (%)
9172465
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number172465
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9172465
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common172465
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9172465
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII172465
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9172465
100.0%

MarcaVehiculo__c
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing34493
Missing (%)100.0%
Memory size269.6 KiB

MdeloVehiculo__c
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing34493
Missing (%)100.0%
Memory size269.6 KiB

TipoVehiculo__c
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB

99999 | 34493

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters172465
Distinct characters1
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row99999
2nd row99999
3rd row99999
4th row99999
5th row99999

Common Values

Value | Count | Frequency (%)
99999 | 34493 | 100.0%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
9999934493
100.0%

Most occurring characters

ValueCountFrequency (%)
9172465
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number172465
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9172465
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common172465
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9172465
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII172465
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9172465
100.0%

NumeroPoliza__c
Real number (ℝ≥0)

HIGH CORRELATION

Distinct27316
Distinct (%)79.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3036916.702
Minimum1001678
Maximum3173059
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size269.6 KiB

Quantile statistics

Minimum1001678
5-th percentile3003457
Q13009494
median3040235
Q33076091
95-th percentile3129858.4
Maximum3173059
Range2171381
Interquartile range (IQR)66597

Descriptive statistics

Standard deviation: 146889.6846
Coefficient of variation (CV): 0.04836803213
Kurtosis: 170.554896
Mean: 3036916.702
Median Absolute Deviation (MAD): 31979
Skewness: -12.56946917
Sum: 1.047523678 × 10^11
Variance: 2.157657945 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
3004670 | 6 | < 0.1%
3007953 | 6 | < 0.1%
3004764 | 6 | < 0.1%
3007923 | 6 | < 0.1%
3007934 | 6 | < 0.1%
3004772 | 6 | < 0.1%
3007889 | 6 | < 0.1%
3007894 | 6 | < 0.1%
3007900 | 6 | < 0.1%
3007924 | 6 | < 0.1%
Other values (27306) | 34433 | 99.8%

Minimum values
Value | Count | Frequency (%)
1001678 | 1 | < 0.1%
1002657 | 1 | < 0.1%
1003404 | 1 | < 0.1%
1003693 | 1 | < 0.1%
1003806 | 1 | < 0.1%
1004014 | 1 | < 0.1%
1005652 | 1 | < 0.1%
1005945 | 1 | < 0.1%
1006025 | 1 | < 0.1%
1006179 | 1 | < 0.1%

Maximum values
Value | Count | Frequency (%)
3173059 | 1 | < 0.1%
3173056 | 1 | < 0.1%
3172855 | 1 | < 0.1%
3172830 | 1 | < 0.1%
3172803 | 1 | < 0.1%
3171951 | 1 | < 0.1%
3171790 | 1 | < 0.1%
3170723 | 1 | < 0.1%
3169512 | 1 | < 0.1%
3169510 | 1 | < 0.1%

FechaInicioVigencia__ctrim
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 MiB

03-2018 | 5568
02-2018 | 5322
02-2019 | 4876
01-2019 | 4705
03-2020 | 4148
Other values (5) | 9874

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters241451
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row01-2018
2nd row01-2018
3rd row01-2018
4th row01-2018
5th row01-2018

Common Values

Value | Count | Frequency (%)
03-2018 | 5568 | 16.1%
02-2018 | 5322 | 15.4%
02-2019 | 4876 | 14.1%
01-2019 | 4705 | 13.6%
03-2020 | 4148 | 12.0%
01-2018 | 4044 | 11.7%
02-2021 | 3829 | 11.1%
01-2021 | 1972 | 5.7%
03-2019 | 22 | 0.1%
02-2020 | 7 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
03-20185568
16.1%
02-20185322
15.4%
02-20194876
14.1%
01-20194705
13.6%
03-20204148
12.0%
01-20184044
11.7%
02-20213829
11.1%
01-20211972
 
5.7%
03-201922
 
0.1%
02-20207
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
073141
30.3%
258483
24.2%
141059
17.0%
-34493
14.3%
814934
 
6.2%
39738
 
4.0%
99603
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number206958
85.7%
Dash Punctuation34493
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
073141
35.3%
258483
28.3%
141059
19.8%
814934
 
7.2%
39738
 
4.7%
99603
 
4.6%
Dash Punctuation
ValueCountFrequency (%)
-34493
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common241451
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
073141
30.3%
258483
24.2%
141059
17.0%
-34493
14.3%
814934
 
6.2%
39738
 
4.0%
99603
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII241451
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
073141
30.3%
258483
24.2%
141059
17.0%
-34493
14.3%
814934
 
6.2%
39738
 
4.0%
99603
 
4.0%

churn
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 MiB

1 | 24191
0 | 10302

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters34493
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row1
2nd row0
3rd row0
4th row0
5th row1

Common Values

Value | Count | Frequency (%)
1 | 24191 | 70.1%
0 | 10302 | 29.9%
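
The class split can be recomputed from the two counts. The sketch below is illustrative only: the report does not say what churn = 1 means, so treating it as the churned class is an assumption, and the majority-class share is shown only as a naive reference point for any later model.

```python
# Counts copied from the churn table above.
counts = {"1": 24191, "0": 10302}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"churn={label}: {share:.1%}")

# A classifier that always predicts the majority class would reach this accuracy.
majority_share = max(shares.values())
```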

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
124191
70.1%
010302
29.9%

Most occurring characters

ValueCountFrequency (%)
124191
70.1%
010302
29.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number34493
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
124191
70.1%
010302
29.9%

Most occurring scripts

ValueCountFrequency (%)
Common34493
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
124191
70.1%
010302
29.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII34493
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
124191
70.1%
010302
29.9%

n_prod_prev
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct10
Distinct (%)< 0.1%
Missing1130
Missing (%)3.3%
Infinite0
Infinite (%)0.0%
Mean3.104037407
Minimum1
Maximum16
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size269.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median2
Q35
95-th percentile14
Maximum16
Range15
Interquartile range (IQR)4

Descriptive statistics

Standard deviation3.275958966
Coefficient of variation (CV)1.055386433
Kurtosis6.498626545
Mean3.104037407
Median Absolute Deviation (MAD)1
Skewness2.502886697
Sum103560
Variance10.73190714
MonotonicityNot monotonic
Histogram with fixed size bins (bins=10)
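Common values
Value | Count | Frequency (%)
1 | 14608 | 42.4%
5 | 7479 | 21.7%
2 | 6060 | 17.6%
3 | 2301 | 6.7%
14 | 1035 | 3.0%
16 | 772 | 2.2%
4 | 747 | 2.2%
8 | 263 | 0.8%
6 | 86 | 0.2%
7 | 12 | < 0.1%
(Missing) | 1130 | 3.3%

Minimum values
Value | Count | Frequency (%)
1 | 14608 | 42.4%
2 | 6060 | 17.6%
3 | 2301 | 6.7%
4 | 747 | 2.2%
5 | 7479 | 21.7%
6 | 86 | 0.2%
7 | 12 | < 0.1%
8 | 263 | 0.8%
14 | 1035 | 3.0%
16 | 772 | 2.2%

Maximum values
Value | Count | Frequency (%)
16 | 772 | 2.2%
14 | 1035 | 3.0%
8 | 263 | 0.8%
7 | 12 | < 0.1%
6 | 86 | 0.2%
5 | 7479 | 21.7%
4 | 747 | 2.2%
3 | 2301 | 6.7%
2 | 6060 | 17.6%
1 | 14608 | 42.4%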

total_siniestros
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct73
Distinct (%)0.5%
Missing18691
Missing (%)54.2%
Infinite0
Infinite (%)0.0%
Mean1640.038286
Minimum1
Maximum3466
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size269.6 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median704
Q33466
95-th percentile3466
Maximum3466
Range3465
Interquartile range (IQR)3464

Descriptive statistics

Standard deviation1683.612314
Coefficient of variation (CV)1.026568909
Kurtosis-1.950909277
Mean1640.038286
Median Absolute Deviation (MAD)703
Skewness0.1456458098
Sum25915885
Variance2834550.425
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
3466 | 7219 | 20.9%
1 | 3738 | 10.8%
704 | 1035 | 3.0%
2 | 1012 | 2.9%
136 | 772 | 2.2%
3 | 429 | 1.2%
4 | 278 | 0.8%
5 | 262 | 0.8%
92 | 207 | 0.6%
7 | 168 | 0.5%
Other values (63) | 682 | 2.0%
(Missing) | 18691 | 54.2%

Minimum values
Value | Count | Frequency (%)
1 | 3738 | 10.8%
2 | 1012 | 2.9%
3 | 429 | 1.2%
4 | 278 | 0.8%
5 | 262 | 0.8%
6 | 85 | 0.2%
7 | 168 | 0.5%
8 | 28 | 0.1%
9 | 57 | 0.2%
10 | 64 | 0.2%

Maximum values
Value | Count | Frequency (%)
3466 | 7219 | 20.9%
2972 | 4 | < 0.1%
1311 | 1 | < 0.1%
770 | 1 | < 0.1%
704 | 1035 | 3.0%
543 | 5 | < 0.1%
215 | 1 | < 0.1%
179 | 1 | < 0.1%
145 | 3 | < 0.1%
140 | 1 | < 0.1%

total_pagado_smmlv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
ZEROS

Distinct2308
Distinct (%)14.6%
Missing18691
Missing (%)54.2%
Infinite0
Infinite (%)0.0%
Mean11868.21025
Minimum0
Maximum64793.74852
Zeros1697
Zeros (%)4.9%
Negative0
Negative (%)0.0%
Memory size269.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q110.99240408
median5779.809377
Q324921.89139
95-th percentile24921.89139
Maximum64793.74852
Range64793.74852
Interquartile range (IQR)24910.89899

Descriptive statistics

Standard deviation12093.24543
Coefficient of variation (CV)1.018961172
Kurtosis-1.863889317
Mean11868.21025
Median Absolute Deviation (MAD)5779.809377
Skewness0.155327507
Sum187541458.4
Variance146246585
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
24921.89139 | 7219 | 20.9%
0 | 1697 | 4.9%
5779.809377 | 1035 | 3.0%
1308.359896 | 772 | 2.2%
506.8921899 | 207 | 0.6%
41.55167503 | 87 | 0.3%
133.4554683 | 71 | 0.2%
205.1638361 | 54 | 0.2%
794.1729883 | 32 | 0.1%
289.5339288 | 29 | 0.1%
Other values (2298) | 4599 | 13.3%
(Missing) | 18691 | 54.2%

ValueCountFrequency (%)
01697
4.9%
0.011439345982
 
< 0.1%
0.023414137341
 
< 0.1%
0.058392310336
 
< 0.1%
0.066517251211
 
< 0.1%
0.0686159912
 
< 0.1%
0.074825444894
 
< 0.1%
0.08072274669
 
< 0.1%
0.098305006642
 
< 0.1%
0.11388984151
 
< 0.1%
ValueCountFrequency (%)
64793.748524
 
< 0.1%
24921.891397219
20.9%
5779.8093771035
 
3.0%
3629.204432
 
< 0.1%
2962.7489645
 
< 0.1%
2501.8172042
 
< 0.1%
1308.359896772
 
2.2%
1295.6617331
 
< 0.1%
1194.0444192
 
< 0.1%
1124.7842813
 
< 0.1%

anios_ultimo_siniestro
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct1140
Distinct (%)7.2%
Missing18691
Missing (%)54.2%
Infinite0
Infinite (%)0.0%
Mean0.66014862
Minimum0.002739726027
Maximum10.69863014
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size269.6 KiB

Quantile statistics

Minimum0.002739726027
5-th percentile0.002739726027
Q10.002739726027
median0.008219178082
Q31.238356164
95-th percentile2.764383562
Maximum10.69863014
Range10.69589041
Interquartile range (IQR)1.235616438

Descriptive statistics

Standard deviation1.042983795
Coefficient of variation (CV)1.579922708
Kurtosis5.679649006
Mean0.66014862
Median Absolute Deviation (MAD)0.005479452055
Skewness1.884821477
Sum10431.66849
Variance1.087815197
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.002739726027 | 7223 | 20.9%
0.0602739726 | 1044 | 3.0%
0.008219178082 | 798 | 2.3%
0.04383561644 | 209 | 0.6%
0.4328767123 | 90 | 0.3%
0.08767123288 | 90 | 0.3%
0.1534246575 | 66 | 0.2%
0.07671232877 | 46 | 0.1%
1.684931507 | 32 | 0.1%
1.449315068 | 32 | 0.1%
Other values (1130) | 6172 | 17.9%
(Missing) | 18691 | 54.2%

Minimum values
Value | Count | Frequency (%)
0.002739726027 | 7223 | 20.9%
0.005479452055 | 13 | < 0.1%
0.008219178082 | 798 | 2.3%
0.01095890411 | 13 | < 0.1%
0.01369863014 | 13 | < 0.1%
0.01643835616 | 11 | < 0.1%
0.01917808219 | 4 | < 0.1%
0.02465753425 | 9 | < 0.1%
0.02739726027 | 2 | < 0.1%
0.0301369863 | 8 | < 0.1%

Maximum values
Value | Count | Frequency (%)
10.69863014 | 2 | < 0.1%
10.00273973 | 6 | < 0.1%
9.095890411 | 1 | < 0.1%
7.923287671 | 1 | < 0.1%
7.509589041 | 2 | < 0.1%
7.282191781 | 1 | < 0.1%
7.082191781 | 2 | < 0.1%
6.97260274 | 1 | < 0.1%
6.81369863 | 1 | < 0.1%
6.668493151 | 2 | < 0.1%

Activos__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 3500
Distinct (%): 17.4%
Missing: 14407
Missing (%): 41.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 426319534.1
Minimum: 0
Maximum: 1 × 10^12
Zeros: 20
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 269.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 21250000
Q1: 60000000
median: 120000000
Q3: 250000000
95-th percentile: 994554000
Maximum: 1 × 10^12
Range: 1 × 10^12
Interquartile range (IQR): 190000000

Descriptive statistics

Standard deviation: 1.009835822 × 10^10
Coefficient of variation (CV): 23.68729887
Kurtosis: 9560.011934
Mean: 426319534.1
Median Absolute Deviation (MAD): 79053000
Skewness: 96.7181334
Sum: 8.563054163 × 10^12
Variance: 1.019768387 × 10^20
Monotonicity: Not monotonic
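
The skewness reported here (the γ1 in the Skewed alert) measures how lopsided the distribution is. A minimal sketch of one common definition, the Fisher-Pearson coefficient g1 = m3 / m2^1.5 without bias correction, is shown below. The report's own estimator may apply a sample-size correction, so this is an illustration of the idea rather than a reproduction of the 96.7 figure. The function name and toy inputs are hypothetical.

```python
from typing import Sequence


def sample_skewness(values: Sequence[float]) -> float:
    """Fisher-Pearson skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Toy data: symmetric values versus one large value stretching the right tail.
print(sample_skewness([1, 2, 3]))
print(sample_skewness([0, 0, 0, 10]))
```

A handful of extreme values, such as the 1 × 10^12 maximum against a median of 120000000, is enough to push this statistic very high.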
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
100000000 | 1003 | 2.9%
80000000 | 753 | 2.2%
150000000 | 711 | 2.1%
200000000 | 686 | 2.0%
50000000 | 640 | 1.9%
120000000 | 557 | 1.6%
60000000 | 544 | 1.6%
40000000 | 482 | 1.4%
30000000 | 443 | 1.3%
90000000 | 417 | 1.2%
Other values (3490) | 13850 | 40.2%
(Missing) | 14407 | 41.8%

Minimum values
Value | Count | Frequency (%)
0 | 20 | 0.1%
1 | 66 | 0.2%
2 | 3 | < 0.1%
20 | 2 | < 0.1%
40 | 3 | < 0.1%
50 | 1 | < 0.1%
80 | 1 | < 0.1%
100 | 3 | < 0.1%
104 | 1 | < 0.1%
350 | 2 | < 0.1%

Maximum values
Value | Count | Frequency (%)
1 × 10^12 | 2 | < 0.1%
1 × 10^11 | 2 | < 0.1%
5.835 × 10^10 | 1 | < 0.1%
5.43154931 × 10^10 | 1 | < 0.1%
4.479 × 10^10 | 1 | < 0.1%
4.0078653 × 10^10 | 4 | < 0.1%
2.8125029 × 10^10 | 2 | < 0.1%
2.43 × 10^10 | 1 | < 0.1%
2.0615603 × 10^10 | 1 | < 0.1%
2.0178092 × 10^10 | 1 | < 0.1%

AnnualRevenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING
SKEWED

Distinct: 3424
Distinct (%): 17.0%
Missing: 14407
Missing (%): 41.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 240635915.3
Minimum: 0
Maximum: 8.63 × 10^10
Zeros: 6
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 269.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8600000
Q1: 26400000
median: 45000000
Q3: 84000000
95-th percentile: 586071843.2
Maximum: 8.63 × 10^10
Range: 8.63 × 10^10
Interquartile range (IQR): 57600000

Descriptive statistics

Standard deviation: 1545629592
Coefficient of variation (CV): 6.423104338
Kurtosis: 825.72905
Mean: 240635915.3
Median Absolute Deviation (MAD): 22000000
Skewness: 22.54403697
Sum: 4.833412995 × 10^12
Variance: 2.388970835 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
36000000 | 1089 | 3.2%
60000000 | 845 | 2.4%
24000000 | 770 | 2.2%
30000000 | 767 | 2.2%
48000000 | 630 | 1.8%
18000000 | 442 | 1.3%
50000000 | 393 | 1.1%
40000000 | 374 | 1.1%
12000000 | 330 | 1.0%
42000000 | 315 | 0.9%
Other values (3414) | 14131 | 41.0%
(Missing) | 14407 | 41.8%

Minimum values
Value | Count | Frequency (%)
0 | 6 | < 0.1%
1 | 27 | 0.1%
20 | 2 | < 0.1%
52230 | 1 | < 0.1%
250000 | 1 | < 0.1%
350000 | 1 | < 0.1%
500000 | 3 | < 0.1%
600000 | 1 | < 0.1%
700000 | 1 | < 0.1%
800000 | 12 | < 0.1%

Maximum values
Value | Count | Frequency (%)
8.63 × 10^10 | 1 | < 0.1%
6.6728 × 10^10 | 1 | < 0.1%
4.1610143 × 10^10 | 2 | < 0.1%
3.6110425 × 10^10 | 5 | < 0.1%
2.576590619 × 10^10 | 2 | < 0.1%
2.469016564 × 10^10 | 3 | < 0.1%
2.3626255 × 10^10 | 2 | < 0.1%
2.3246 × 10^10 | 1 | < 0.1%
2.2159396 × 10^10 | 2 | < 0.1%
2.1401724 × 10^10 | 2 | < 0.1%

MontoAnual__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 9
Distinct (%): 45.0%
Missing: 34473
Missing (%): 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 3422.05
Minimum: 0
Maximum: 21605
Zeros: 9
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 269.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3
Q3: 3000
95-th percentile: 21605
Maximum: 21605
Range: 21605
Interquartile range (IQR): 3000

Descriptive statistics

Standard deviation: 6716.982315
Coefficient of variation (CV): 1.962853352
Kurtosis: 4.346263153
Mean: 3422.05
Median Absolute Deviation (MAD): 3
Skewness: 2.280328803
Sum: 68441
Variance: 45117851.42
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value       Count  Frequency (%)
0           9      < 0.1%
3000        3      < 0.1%
21605       2      < 0.1%
10000       1      < 0.1%
5525        1      < 0.1%
600         1      < 0.1%
5           1      < 0.1%
1           1      < 0.1%
100         1      < 0.1%
(Missing)   34473  99.9%
Value       Count  Frequency (%)
0           9      < 0.1%
1           1      < 0.1%
5           1      < 0.1%
100         1      < 0.1%
600         1      < 0.1%
3000        3      < 0.1%
5525        1      < 0.1%
10000       1      < 0.1%
21605       2      < 0.1%
Value       Count  Frequency (%)
21605       2      < 0.1%
10000       1      < 0.1%
5525        1      < 0.1%
3000        3      < 0.1%
600         1      < 0.1%
100         1      < 0.1%
5           1      < 0.1%
1           1      < 0.1%
0           9      < 0.1%

OtrosIngresos__c
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct: 217
Distinct (%): 1.1%
Missing: 15501
Missing (%): 44.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2248604.142
Minimum: 0
Maximum: 8400000000
Zeros: 18238
Zeros (%): 52.9%
Negative: 0
Negative (%): 0.0%
Memory size: 269.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 8400000000
Range: 8400000000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 69064469.41
Coefficient of variation (CV): 30.71437437
Kurtosis: 11607.31434
Mean: 2248604.142
Median Absolute Deviation (MAD): 0
Skewness: 98.99998221
Sum: 4.270548986 × 10^10
Variance: 4.769900936 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    18238  52.9%
12000000             52     0.2%
10000000             31     0.1%
24000000             29     0.1%
30000000             23     0.1%
6000000              22     0.1%
20000000             19     0.1%
7200000              17     < 0.1%
18000000             17     < 0.1%
15000000             16     < 0.1%
Other values (207)   528    1.5%
(Missing)            15501  44.9%
Value       Count  Frequency (%)
0           18238  52.9%
183000      1      < 0.1%
200000      4      < 0.1%
228000      1      < 0.1%
239000      1      < 0.1%
244000      1      < 0.1%
274000      7      < 0.1%
300000      2      < 0.1%
378000      2      < 0.1%
378840      1      < 0.1%
Value        Count  Frequency (%)
8400000000   1      < 0.1%
1664801000   5      < 0.1%
936054000    1      < 0.1%
928737000    2      < 0.1%
862423500    2      < 0.1%
436838000    1      < 0.1%
398188968    1      < 0.1%
268346000    3      < 0.1%
250423000    2      < 0.1%
233846000    2      < 0.1%

Profesion__pc
Unsupported

MISSING
REJECTED
UNSUPPORTED

Missing: 34493
Missing (%): 100.0%
Memory size: 269.6 KiB

EgresosAnuales__c
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 2698
Distinct (%): 13.4%
Missing: 14407
Missing (%): 41.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 162201749.3
Minimum: 0
Maximum: 3.6967344 × 10^10
Zeros: 10
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 269.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3000000
Q1: 12300000
Median: 24000000
Q3: 50000000
95-th percentile: 441252500
Maximum: 3.6967344 × 10^10
Range: 3.6967344 × 10^10
Interquartile range (IQR): 37700000

Descriptive statistics

Standard deviation: 1034423361
Coefficient of variation (CV): 6.377387207
Kurtosis: 457.3774663
Mean: 162201749.3
Median Absolute Deviation (MAD): 14000000
Skewness: 18.32164113
Sum: 3.257984337 × 10^12
Variance: 1.07003169 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
12000000              1056   3.1%
30000000              918    2.7%
24000000              860    2.5%
18000000              813    2.4%
20000000              797    2.3%
36000000              493    1.4%
15000000              468    1.4%
10000000              451    1.3%
40000000              428    1.2%
25000000              393    1.1%
Other values (2688)   13409  38.9%
(Missing)             14407  41.8%
Value       Count  Frequency (%)
0           10     < 0.1%
1           114    0.3%
10          8      < 0.1%
18          1      < 0.1%
100         1      < 0.1%
204         2      < 0.1%
20000       1      < 0.1%
70000       1      < 0.1%
72000       3      < 0.1%
87699       1      < 0.1%
Value                 Count  Frequency (%)
3.6967344 × 10^10     2      < 0.1%
3.0868341 × 10^10     5      < 0.1%
2.166849912 × 10^10   2      < 0.1%
2.1322738 × 10^10     2      < 0.1%
1.9746108 × 10^10     2      < 0.1%
1.923885007 × 10^10   3      < 0.1%
1.8035021 × 10^10     1      < 0.1%
1.629516502 × 10^10   2      < 0.1%
1.4868 × 10^10        2      < 0.1%
1.2720892 × 10^10     1      < 0.1%

EstadoCivil__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): < 0.1%
Missing: 14147
Missing (%): 41.0%
Memory size: 1.7 MiB
SOLTERO            9638
CASADO             8774
OTRO               1444
UNIDO              295
VIUDO              73
Other values (3)   122

Length

Max length: 10
Median length: 8
Mean length: 6.32709132
Min length: 3

Characters and Unicode

Total characters: 128731
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CASADO
2nd row: CASADO
3rd row: OTRO
4th row: CASADO
5th row: SOLTERO

Common Values

Value        Count  Frequency (%)
SOLTERO      9638   27.9%
CASADO       8774   25.4%
OTRO         1444   4.2%
UNIDO        295    0.9%
VIUDO        73     0.2%
SEPARADO     69     0.2%
DIVORCIADO   42     0.1%
N A          11     < 0.1%
(Missing)    14147  41.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value        Count  Frequency (%)
soltero      9638   47.3%
casado       8774   43.1%
otro         1444   7.1%
unido        295    1.4%
viudo        73     0.4%
separado     69     0.3%
divorciado   42     0.2%
n            11     0.1%
a            11     0.1%

Most occurring characters

Value              Count   Frequency (%)
O                  31459   24.4%
S                  18481   14.4%
A                  17739   13.8%
R                  11193   8.7%
T                  11082   8.6%
E                  9707    7.5%
L                  9638    7.5%
D                  9295    7.2%
C                  8816    6.8%
I                  452     0.4%
Other values (5)   869     0.7%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   128720  > 99.9%
Space Separator    11      < 0.1%

Most frequent character per category

Uppercase Letter
Value              Count   Frequency (%)
O                  31459   24.4%
S                  18481   14.4%
A                  17739   13.8%
R                  11193   8.7%
T                  11082   8.6%
E                  9707    7.5%
L                  9638    7.5%
D                  9295    7.2%
C                  8816    6.8%
I                  452     0.4%
Other values (4)   858     0.7%
Space Separator
Value              Count   Frequency (%)
(space)            11      100.0%

Most occurring scripts

Value              Count   Frequency (%)
Latin              128720  > 99.9%
Common             11      < 0.1%

Most frequent character per script

Latin
Value              Count   Frequency (%)
O                  31459   24.4%
S                  18481   14.4%
A                  17739   13.8%
R                  11193   8.7%
T                  11082   8.6%
E                  9707    7.5%
L                  9638    7.5%
D                  9295    7.2%
C                  8816    6.8%
I                  452     0.4%
Other values (4)   858     0.7%
Common
Value              Count   Frequency (%)
(space)            11      100.0%

Most occurring blocks

Value              Count   Frequency (%)
ASCII              128731  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
O                  31459   24.4%
S                  18481   14.4%
A                  17739   13.8%
R                  11193   8.7%
T                  11082   8.6%
E                  9707    7.5%
L                  9638    7.5%
D                  9295    7.2%
C                  8816    6.8%
I                  452     0.4%
Other values (5)   869     0.7%

Genero__pc
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 14147
Missing (%): 41.0%
Memory size: 1.7 MiB
MASCULINO   15404
FEMENINO    4934
N A         8

Length

Max length: 9
Median length: 9
Mean length: 8.755136145
Min length: 3

Characters and Unicode

Total characters: 178132
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MASCULINO
2nd row: FEMENINO
3rd row: FEMENINO
4th row: MASCULINO
5th row: MASCULINO

Common Values

Value       Count  Frequency (%)
MASCULINO   15404  44.7%
FEMENINO    4934   14.3%
N A         8      < 0.1%
(Missing)   14147  41.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value       Count  Frequency (%)
masculino   15404  75.7%
femenino    4934   24.2%
n           8      < 0.1%
a           8      < 0.1%

Most occurring characters

Value              Count   Frequency (%)
N                  25280   14.2%
M                  20338   11.4%
I                  20338   11.4%
O                  20338   11.4%
A                  15412   8.7%
S                  15404   8.6%
C                  15404   8.6%
U                  15404   8.6%
L                  15404   8.6%
E                  9868    5.5%
Other values (2)   4942    2.8%

Most occurring categories

Value              Count   Frequency (%)
Uppercase Letter   178124  > 99.9%
Space Separator    8       < 0.1%

Most frequent character per category

Uppercase Letter
Value              Count   Frequency (%)
N                  25280   14.2%
M                  20338   11.4%
I                  20338   11.4%
O                  20338   11.4%
A                  15412   8.7%
S                  15404   8.6%
C                  15404   8.6%
U                  15404   8.6%
L                  15404   8.6%
E                  9868    5.5%
Space Separator
Value              Count   Frequency (%)
(space)            8       100.0%

Most occurring scripts

Value              Count   Frequency (%)
Latin              178124  > 99.9%
Common             8       < 0.1%

Most frequent character per script

Latin
Value              Count   Frequency (%)
N                  25280   14.2%
M                  20338   11.4%
I                  20338   11.4%
O                  20338   11.4%
A                  15412   8.7%
S                  15404   8.6%
C                  15404   8.6%
U                  15404   8.6%
L                  15404   8.6%
E                  9868    5.5%
Common
Value              Count   Frequency (%)
(space)            8       100.0%

Most occurring blocks

Value              Count   Frequency (%)
ASCII              178132  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
N                  25280   14.2%
M                  20338   11.4%
I                  20338   11.4%
O                  20338   11.4%
A                  15412   8.7%
S                  15404   8.6%
C                  15404   8.6%
U                  15404   8.6%
L                  15404   8.6%
E                  9868    5.5%
Other values (2)   4942    2.8%

ciudad_name
Categorical

HIGH CORRELATION
MISSING

Distinct: 22
Distinct (%): 0.1%
Missing: 14147
Missing (%): 41.0%
Memory size: 1.7 MiB
otras               15833
BOGOTÁ D.C.         1514
MEDELLIN            727
CALI                556
CARTAGENA           179
Other values (17)   1537

Length

Max length: 13
Median length: 5
Mean length: 5.807578885
Min length: 4

Characters and Unicode

Total characters: 118161
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: otras
2nd row: BOGOTÁ D.C.
3rd row: otras
4th row: BARRANQUILLA
5th row: otras

Common Values

Value               Count  Frequency (%)
otras               15833  45.9%
BOGOTÁ D.C.         1514   4.4%
MEDELLIN            727    2.1%
CALI                556    1.6%
CARTAGENA           179    0.5%
VILLAVICENCIO       173    0.5%
YOPAL               172    0.5%
PASTO               166    0.5%
BUCARAMANGA         162    0.5%
MANIZALES           129    0.4%
Other values (12)   735    2.1%
(Missing)           14147  41.0%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
otras               15833  72.0%
bogotá              1514   6.9%
d.c                 1514   6.9%
medellin            727    3.3%
cali                556    2.5%
cartagena           179    0.8%
villavicencio       173    0.8%
yopal               172    0.8%
pasto               166    0.8%
bucaramanga         162    0.7%
Other values (17)   1003   4.6%

Most occurring characters

Value               Count  Frequency (%)
o                   15833  13.4%
r                   15833  13.4%
a                   15833  13.4%
s                   15833  13.4%
t                   15833  13.4%
O                   3748   3.2%
A                   3621   3.1%
C                   3072   2.6%
.                   3028   2.6%
L                   2951   2.5%
Other values (21)   22576  19.1%

Most occurring categories

Value               Count  Frequency (%)
Lowercase Letter    79165  67.0%
Uppercase Letter    34315  29.0%
Other Punctuation   3028   2.6%
Space Separator     1653   1.4%

Most frequent character per category

Uppercase Letter
Value               Count   Frequency (%)
O                   3748    10.9%
A                   3621    10.6%
C                   3072    9.0%
L                   2951    8.6%
I                   2458    7.2%
E                   2387    7.0%
D                   2371    6.9%
T                   2025    5.9%
G                   1957    5.7%
B                   1822    5.3%
Other values (14)   7903    23.0%
Lowercase Letter
Value               Count   Frequency (%)
o                   15833   20.0%
r                   15833   20.0%
a                   15833   20.0%
s                   15833   20.0%
t                   15833   20.0%
Other Punctuation
Value               Count   Frequency (%)
.                   3028    100.0%
Space Separator
Value               Count   Frequency (%)
(space)             1653    100.0%

Most occurring scripts

Value               Count   Frequency (%)
Latin               113480  96.0%
Common              4681    4.0%

Most frequent character per script

Latin
Value               Count   Frequency (%)
o                   15833   14.0%
r                   15833   14.0%
a                   15833   14.0%
s                   15833   14.0%
t                   15833   14.0%
O                   3748    3.3%
A                   3621    3.2%
C                   3072    2.7%
L                   2951    2.6%
I                   2458    2.2%
Other values (19)   18465   16.3%
Common
Value               Count   Frequency (%)
.                   3028    64.7%
(space)             1653    35.3%

Most occurring blocks

Value               Count   Frequency (%)
ASCII               116420  98.5%
None                1741    1.5%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
o                   15833  13.6%
r                   15833  13.6%
a                   15833  13.6%
s                   15833  13.6%
t                   15833  13.6%
O                   3748   3.2%
A                   3621   3.1%
C                   3072   2.6%
.                   3028   2.6%
L                   2951   2.5%
Other values (18)   20835  17.9%
None
Value               Count  Frequency (%)
Á                   1584   91.0%
Ú                   111    6.4%
É                   46     2.6%

edad
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 9598
Distinct (%): 47.2%
Missing: 14178
Missing (%): 41.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.10082151
Minimum: 1.397260274
Maximum: 122.5123288
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 269.6 KiB

Quantile statistics

Minimum: 1.397260274
5-th percentile: 28.65616438
Q1: 38.85616438
Median: 48.17808219
Q3: 58.83013699
95-th percentile: 71.91041096
Maximum: 122.5123288
Range: 121.1150685
Interquartile range (IQR): 19.9739726

Descriptive statistics

Standard deviation: 13.53019398
Coefficient of variation (CV): 0.2755594216
Kurtosis: -0.04205794412
Mean: 49.10082151
Median Absolute Deviation (MAD): 9.942465753
Skewness: 0.2629557078
Sum: 997483.189
Variance: 183.0661491
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
46.69041096           80     0.2%
47.51780822           77     0.2%
42.46027397           38     0.1%
43.68767123           22     0.1%
41.58356164           21     0.1%
36.50958904           20     0.1%
60.55616438           19     0.1%
47.69041096           16     < 0.1%
54.94246575           16     < 0.1%
71.5260274            15     < 0.1%
Other values (9588)   19991  58.0%
(Missing)             14178  41.1%
Value         Count  Frequency (%)
1.397260274   1      < 0.1%
3.375342466   1      < 0.1%
3.408219178   1      < 0.1%
4.232876712   1      < 0.1%
4.257534247   1      < 0.1%
4.336986301   1      < 0.1%
4.350684932   2      < 0.1%
4.397260274   1      < 0.1%
4.430136986   2      < 0.1%
4.764383562   1      < 0.1%
Value         Count  Frequency (%)
122.5123288   7      < 0.1%
104.9561644   2      < 0.1%
103.0739726   3      < 0.1%
100.0191781   1      < 0.1%
96.36438356   2      < 0.1%
94.30958904   2      < 0.1%
92.80547945   2      < 0.1%
92.49589041   1      < 0.1%
92.25479452   2      < 0.1%
91.08219178   2      < 0.1%

Interactions

(144 pairwise interaction plots; the images are not reproduced in this text version.)

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better at catching nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
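
As a minimal sketch of the rank-based calculation described above (not the report generator's own code), the following standard-library Python assigns average ranks to tied values and then divides the covariance of the ranks by the product of their standard deviations. The names `average_ranks` and `spearman_rho` are illustrative assumptions. The input pairs are four (AnnualRevenue, EgresosAnuales__c) values copied from the First rows sample, so the result says nothing about the full dataset.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(rank_x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    var_x = sum((a - mean_x) ** 2 for a in rank_x)
    var_y = sum((b - mean_y) ** 2 for b in rank_y)
    # The 1/n factors of the covariance and the standard deviations cancel.
    return cov / math.sqrt(var_x * var_y)


# Four complete rows from the "First rows" sample (illustration only).
annual_revenue = [14400000.0, 70800000.0, 132000000.0, 22800000.0]
annual_expenses = [2600000.0, 42000000.0, 60000000.0, 9000000.0]
print(f"rho on the four sample rows: {spearman_rho(annual_revenue, annual_expenses):.3f}")
```

The two columns rank these four rows in exactly the same order, so ρ equals 1 for this subset even though the relationship between the raw values is not exactly linear.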

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the angle of the line to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
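
The formula above can be sketched in a few lines of standard-library Python. This is an illustrative implementation under assumed names (`pearson_r` and the toy lists are not part of the report), and it also checks the stated invariance under a shift and a positive rescaling of each variable.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n factors of the covariance and the standard deviations cancel.
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical values, not taken from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
print(r_original, r_rescaled)
```

The two printed values agree up to floating-point rounding. A negative scale factor on one variable would flip the sign of r but not its magnitude.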

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
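
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch in standard-library Python follows; the function name `kendall_tau_a` and the toy data are assumptions for illustration. Statistical libraries often default to the tie-adjusted tau-b instead, so values may differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie; tau-a counts it as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data (hypothetical): only the pair (2, 3) vs (3, 2) is discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6.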

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
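
As a hedged illustration of the bias-corrected estimator, the sketch below computes the chi-squared statistic from a contingency table and applies the correction as it is commonly stated for Bergsma's proposal. The function name `cramers_v_corrected` and the toy category lists are assumptions; the report's own implementation may differ in detail.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    # Pearson chi-squared statistic over all cells of the contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected
    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denominator = min(k_corr - 1, r_corr - 1)
    if denominator <= 0:
        return 0.0  # a constant column carries no association information
    return math.sqrt(phi2_corr / denominator)


# Toy labels (hypothetical rows, reusing category names that appear in the report).
estado_civil = ["CASADO", "CASADO", "SOLTERO", "SOLTERO", "OTRO", "CASADO"]
genero = ["MASCULINO", "FEMENINO", "MASCULINO", "MASCULINO", "FEMENINO", "MASCULINO"]
print(cramers_v_corrected(estado_civil, genero))
```

Clamping the corrected φ² at zero keeps the result inside the 0 to 1 range when the observed association is weaker than the expected bias.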

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available for the Phik package.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
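
To make the nullity correlation from the heatmap caption concrete, the sketch below treats each column as a 0/1 "value present" indicator and computes the Pearson correlation between two such indicators. This is an assumption about how such a heatmap is typically built, not a copy of the report's code; `nullity_correlation` and the short example columns (modelled on the First rows sample, with `None` standing in for NaN) are illustrative.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    present_a = [0.0 if v is None else 1.0 for v in col_a]
    present_b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(present_a)
    mean_a, mean_b = sum(present_a) / n, sum(present_b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(present_a, present_b))
    var_a = sum((p - mean_a) ** 2 for p in present_a)
    var_b = sum((q - mean_b) ** 2 for q in present_b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a fully present or fully missing column has no variation
    return cov / math.sqrt(var_a * var_b)


# Columns modelled on the sample rows: both are missing on the same rows.
annual_revenue = [None, 14400000.0, 70800000.0, None, 132000000.0]
annual_expenses = [None, 2600000.0, 42000000.0, None, 60000000.0]
# A hypothetical column that is present exactly where the others are missing.
claims_total = [78.0, None, None, 2.0, None]

print(nullity_correlation(annual_revenue, annual_expenses))
print(nullity_correlation(annual_revenue, claims_total))
```

A value of 1 means the two columns are always missing together, and -1 means one is present exactly when the other is missing.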

Sample

First rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_ramo_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cTipoVehiculo__cNumeroPoliza__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcciudad_nameedad
027002automovilesautomoviles99999NaNNaN99999101016601-201811.078.0188.7642670.005479NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
147002automovilesautomoviles99999NaNNaN99999100997001-201801.01.04.4213800.427397NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
227002automovilesautomoviles99999NaNNaN99999101043001-201801.0543.02962.7489640.008219NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
347002automovilesautomoviles99999NaNNaN99999101068201-201801.02.021.7811511.967123NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
413202automovilesautomoviles99999NaNNaN99999306519301-201811.0NaNNaNNaN29990000.014400000.0NaN0.0NaN2600000.0CASADOMASCULINOotras31.605479
513202automovilesautomoviles99999NaNNaN99999306533301-201811.0NaNNaNNaN37625000.070800000.0NaN0.0NaN42000000.0CASADOFEMENINOBOGOTÁ D.C.44.312329
613202automovilesautomoviles99999NaNNaN99999306513001-201811.0NaNNaNNaNNaNNaNNaNNaNNaNNaNOTROFEMENINOotras30.454795
713202automovilesautomoviles99999NaNNaN99999306577301-201811.0NaNNaNNaN200000000.0132000000.0NaN0.0NaN60000000.0CASADOMASCULINOBARRANQUILLA26.315068
813202automovilesautomoviles99999NaNNaN99999305448101-201811.0NaNNaNNaN95000000.022800000.0NaN0.0NaN9000000.0SOLTEROMASCULINOotras25.887671
913202automovilesautomoviles99999NaNNaN99999307612601-2018014.0704.05779.8093770.060274NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN

Last rows

CodigoTipoAsegurado__cPuntoVenta__ctipo_ramo_nametipo_prod_descClaseVehiculo__cMarcaVehiculo__cMdeloVehiculo__cTipoVehiculo__cNumeroPoliza__cFechaInicioVigencia__ctrimchurnn_prod_prevtotal_siniestrostotal_pagado_smmlvanios_ultimo_siniestroActivos__cAnnualRevenueMontoAnual__cOtrosIngresos__cProfesion__pcEgresosAnuales__cEstadoCivil__pcGenero__pcciudad_nameedad
3448313301responsabilidad civilprofesionales medicos99999NaNNaN99999100369302-20211NaNNaNNaNNaN324578000.083160000.0NaN0.0NaN1.0OTROMASCULINOotras61.052055
3448413301responsabilidad civildirectores y administradores99999NaNNaN99999102759702-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3448511048responsabilidad civildirectores y administradores99999NaNNaN99999102009902-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3448611048responsabilidad civildirectores y administradores99999NaNNaN99999102355502-20211NaN2.01.4151210.394521NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3448713301responsabilidad civilprofesionales medicos99999NaNNaN99999102740002-20211NaNNaNNaNNaN18000000.060000000.0NaN0.0NaN58000000.0CASADOMASCULINOBOGOTÁ D.C.NaN
3448813301responsabilidad civildirectores y administradores99999NaNNaN99999102740302-2021114.0704.05779.8093770.060274NaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3448933202responsabilidad civilprofesionales medicos99999NaNNaN99999106011302-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3449018001responsabilidad civilprofesionales medicos99999NaNNaN99999102725502-20211NaNNaNNaNNaN500000000.090000000.0NaN0.0NaN65000000.0SOLTEROMASCULINOBOGOTÁ D.C.122.512329
3449118001responsabilidad civildirectores y administradores99999NaNNaN99999102733102-20210NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
3449218001responsabilidad civildirectores y administradores99999NaNNaN99999102742802-20211NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN